AWS EMR | 4 to 9 Years | PAN India
Capgemini
2 - 5 years
Chennai
Posted: 11/19/2024
Job Description
• Setting up and managing EMR clusters for processing large-scale data using frameworks like Apache Hadoop, Apache Spark, Apache Hive, etc.
• Configuring EMR clusters based on specific requirements, including choosing the appropriate instance types, storage configurations, and software settings.
• Implementing and optimizing data processing workflows on EMR clusters, leveraging distributed computing frameworks for tasks such as data cleansing, transformation, and analysis.
• Writing scripts and code to interact with EMR clusters, often using languages like Python, Java, or Scala, to develop and execute data processing jobs.
• Integrating EMR with other AWS services, such as Amazon S3 for storage, AWS Glue for ETL (Extract, Transform, Load), and AWS Lambda, to create end-to-end data pipelines.
• Optimizing cluster performance by fine-tuning configurations, adjusting resource allocation, and implementing best practices for efficient data processing.
• Implementing monitoring solutions to track cluster performance and troubleshoot issues, ensuring the reliability and availability of the big data processing environment.
• Implementing security measures to protect data within EMR clusters, including configuring access controls and encryption, and ensuring compliance with security policies.
• Implementing automation for cluster provisioning, scaling, and decommissioning to streamline operations and improve efficiency.
• Overall, an AWS EMR role requires a combination of cloud computing knowledge, big data processing expertise, scripting/coding skills, and a good understanding of data engineering principles.
• AWS certifications, such as the "AWS Certified Solutions Architect - Associate" or other relevant certifications demonstrating expertise in cloud computing, are often preferred.
• Good communication skills to interact with cross-functional teams, understand business requirements, and effectively convey technical information.
• Ability to collaborate with data engineers, data scientists, and other stakeholders in a team-oriented environment.
Primary Skill
• AWS Infrastructure Management
• EMR
• EMR infrastructure automation using IaC tools like Terraform
Secondary Skill
• Good Communication Skills
Capgemini is a global business and technology transformation partner, helping organizations to accelerate their dual transition to a digital and sustainable world, while creating tangible impact for enterprises and society. It is a responsible and diverse group of 340,000 team members in more than 50 countries. With its strong over 55-year heritage, Capgemini is trusted by its clients to unlock the value of technology to address the entire breadth of their business needs. It delivers end-to-end services and solutions leveraging strengths from strategy and design to engineering, all fuelled by its market leading capabilities in AI, cloud and data, combined with its deep industry expertise and partner ecosystem. The Group reported 2023 global revenues of €22.5 billion.
About Company
Capgemini is a global leader in consulting, technology services, and digital transformation. Headquartered in Paris, France, Capgemini provides a wide range of services, including IT consulting, managed services, business process outsourcing, and digital transformation solutions. With over 360,000 employees across more than 50 countries, the company focuses on helping organizations innovate and transform their businesses to remain competitive in a rapidly changing digital landscape. Capgemini is known for its expertise in cloud computing, AI, cybersecurity, and other emerging technologies, working closely with clients to develop sustainable and cutting-edge solutions.